Taxonomy of principal distances 


Hamming distance 
$|\{i : p_i \neq q_i\}|$
Euclidean geometry 


Manhattan distance 
$d_1(p, q) = \sum_i |p_i - q_i|$
(city block, taxi cab) 


Euclidean distance 
$d_2(p, q) = \sqrt{\sum_i (p_i - q_i)^2}$
(Pythagoras' theorem, circa 500 BC) 


Minkowski distance ($L_k$-norm) 
$d_k(p, q) = \left( \sum_i |p_i - q_i|^k \right)^{1/k}$


(H. Minkowski 1864-1909) 
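
The Hamming, Manhattan, and Euclidean distances listed above can be sketched in a few lines of Python. This is a minimal illustration, assuming points are given as equal-length sequences of numbers; the function names and the order parameter `k` are illustrative choices, not notation taken from the chart.

```python
def hamming_distance(p, q):
    """Number of positions at which two equal-length sequences differ."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return sum(1 for a, b in zip(p, q) if a != b)


def minkowski_distance(p, q, k):
    """Minkowski distance of order k >= 1 (k=1: Manhattan, k=2: Euclidean)."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    if k < 1:
        raise ValueError("k must be >= 1 for the triangle inequality to hold")
    return sum(abs(a - b) ** k for a, b in zip(p, q)) ** (1.0 / k)


p, q = (0.0, 0.0), (3.0, 4.0)
print(minkowski_distance(p, q, 1))  # Manhattan distance: 7.0
print(minkowski_distance(p, q, 2))  # Euclidean distance: 5.0
```

The Manhattan and Euclidean distances are the same function evaluated at two different orders, which is why the chart groups them under the Minkowski family.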
Space-time geometry 


Quadratic distance 
$d_Q = \sqrt{(p - q)^T Q (p - q)}$
Riemannian metric tensor 


$ds^2 = g_{ij} \, dx_i \, dx_j$
(B. Riemann 1826-1866) 


Itakura-Saito divergence 
$IS(p | q) = \sum_i \left( \frac{p_i}{q_i} - \log \frac{p_i}{q_i} - 1 \right)$
(Burg entropy) 


Rényi 




Non-Euclidean geometries 
Fisher information (local entropy) 
$I(\theta) = E\left[ \left( \frac{\partial}{\partial \theta} \ln p(X | \theta) \right)^2 \right]$
(R. A. Fisher 1890-1962) 


exponential families 


Chernoff divergence (1952) 
$C_\alpha(p \| q) = -\ln \int p^\alpha q^{1-\alpha}$




Statistical geometry 


Information entropy 
$H(p) = -\int p \log p$
(C. Shannon 1948) 
"Life 


“negative entropy 


Additive entropy 


cross-entropy 
conditional entropy 
mutual information 

(chain rules) 


Physics entropy ($J K^{-1}$) 
$-k \int p \log p$


Mahalanobis metric (1936) 
$d_\Sigma = \sqrt{(p - q)^T \Sigma^{-1} (p - q)}$


I-projection 
H(p) = KL(p||u) 




Kullback-Leibler divergence 
$KL(p \| q) = \int p \log \frac{p}{q} = E_p\left[ \log \frac{P}{Q} \right]$
(relative entropy, 1951) 


Jeffrey divergence 
(Jensen-Shannon) 
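
For discrete distributions, the Kullback-Leibler divergence and its two usual symmetrizations can be sketched as below. This is a minimal illustration, assuming probability vectors given as Python lists and natural logarithms; the function names are illustrative. The Jeffreys divergence is taken here as the sum KL(p||q) + KL(q||p), and the Jensen-Shannon divergence as the average KL to the midpoint distribution.

```python
import math


def kl_divergence(p, q):
    """KL(p||q) = sum_i p_i log(p_i / q_i), with the convention 0 log 0 = 0."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


def jeffreys_divergence(p, q):
    """Symmetrized KL: KL(p||q) + KL(q||p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def jensen_shannon_divergence(p, q):
    """Average KL divergence of p and q to their midpoint (p + q) / 2."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


p, q = [0.5, 0.5], [0.9, 0.1]
print(kl_divergence(p, q), kl_divergence(q, p))  # two different values: KL is asymmetric
print(jensen_shannon_divergence(p, q))           # finite and symmetric
```

Unlike KL, the Jensen-Shannon divergence stays finite even when the supports differ, because the midpoint distribution covers both.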


Bhattacharyya distance (1967) 




$K(p \| q) = \int |q - p|$


(Kolmogorov-Smirnov $\max |p - q|$) 


Matsushita distance (1956) 


$M_\alpha(p, q) = \sqrt[\alpha]{\int \left| q^{1/\alpha} - p^{1/\alpha} \right|^\alpha}$


Hellinger 
$\sqrt{\int \left( \sqrt{p} - \sqrt{q} \right)^2}$


$\times \, \alpha(1 - \alpha)$


$\chi^2$ test 


$\chi^2(p \| q) = \int \frac{(q - p)^2}{p}$
(K. Pearson 1857-1936) 


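Rényi divergence (1961) 
$H_\alpha = \frac{1}{1 - \alpha} \log \int f^\alpha$ (additive entropy) 
$R_\alpha(p \| q) = \frac{1}{\alpha(\alpha - 1)} \ln \int p^\alpha q^{1-\alpha}$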


Csiszár's f-divergence 


Neyman 


Kullback-Leibler 




$D_f(p \| q) = \int p \, f\left( \frac{q}{p} \right)$
(Ali & Silvey 1966, Csiszár 1967) 
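
Several entries of this taxonomy are instances of one f-divergence template, D_f(p||q) = sum_i p_i f(q_i / p_i), each with a different convex generator f satisfying f(1) = 0. The sketch below is a minimal discrete illustration, assuming strictly positive probability vectors; the dictionary `GENERATORS` and its keys are illustrative names.

```python
import math


def f_divergence(p, q, f):
    """D_f(p||q) = sum_i p_i * f(q_i / p_i), assuming every p_i > 0."""
    return sum(pi * f(qi / pi) for pi, qi in zip(p, q))


# Convex generators with f(1) = 0; each recovers a named divergence.
GENERATORS = {
    "kullback_leibler": lambda x: -math.log(x),              # sum p log(p/q)
    "chi_square": lambda x: (x - 1.0) ** 2,                   # sum (q-p)^2 / p
    "squared_hellinger": lambda x: (math.sqrt(x) - 1.0) ** 2,  # sum (sqrt q - sqrt p)^2
    "variational": lambda x: abs(x - 1.0),                    # sum |q - p|
}

p = [0.2, 0.3, 0.5]
q = [0.1, 0.6, 0.3]
for name, f in GENERATORS.items():
    print(name, f_divergence(p, q, f))
```

Swapping the generator is the only change needed to move between these divergences, which is the sense in which the f-divergence acts as a parent class in the chart.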


Bregman divergences (1967) 
Dual div. ($*$-conjugate): $f^*(y) = y f(1/y)$
Bregman-Csiszár divergence (1991) 
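$F_\alpha(x) = \begin{cases}
x - \log x - 1 & \alpha = 0 \\
\frac{1}{\alpha(1 - \alpha)} \left( -x^\alpha + \alpha x - \alpha + 1 \right) & 0 < \alpha < 1 \\
x \log x - x + 1 & \alpha = 1
\end{cases}$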
Dual div. (Legendre): $D_{F^*}(\nabla F(p) \| \nabla F(q)) = D_F(q \| p)$
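
A Bregman divergence is built from a strictly convex generator F as D_F(p||q) = F(p) - F(q) - <p - q, grad F(q)>. The sketch below is a minimal illustration, assuming vectors with strictly positive entries and hand-written gradients; all function names are illustrative. It shows three generators that recover the squared Euclidean distance, the Kullback-Leibler divergence (for normalized vectors), and the Itakura-Saito divergence.

```python
import math


def bregman_divergence(F, grad_F, p, q):
    """D_F(p||q) = F(p) - F(q) - <p - q, grad F(q)>."""
    inner = sum((pi - qi) * gi for pi, qi, gi in zip(p, q, grad_F(q)))
    return F(p) - F(q) - inner


# F(x) = sum x_i^2 gives the squared Euclidean distance.
def sq_norm(x):
    return sum(v * v for v in x)

def grad_sq_norm(x):
    return [2.0 * v for v in x]


# F(x) = sum x_i log x_i (negative Shannon entropy) gives KL on the simplex.
def neg_entropy(x):
    return sum(v * math.log(v) for v in x)

def grad_neg_entropy(x):
    return [math.log(v) + 1.0 for v in x]


# F(x) = -sum log x_i (Burg entropy) gives the Itakura-Saito divergence.
def burg(x):
    return -sum(math.log(v) for v in x)

def grad_burg(x):
    return [-1.0 / v for v in x]


p = [0.2, 0.3, 0.5]
q = [0.1, 0.6, 0.3]
print(bregman_divergence(sq_norm, grad_sq_norm, p, q))          # squared Euclidean
print(bregman_divergence(neg_entropy, grad_neg_entropy, p, q))  # Kullback-Leibler
print(bregman_divergence(burg, grad_burg, p, q))                # Itakura-Saito
```

Passing the gradient explicitly keeps the sketch dependency-free; an automatic differentiation library could supply it instead.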
-means 
Generalized Pythagoras’ theorem duality a 
Burbea-Rao 


Permissible Bregman divergences 
(Nock & Nielsen, 2007) 


(incl. Jensen-Shannon) 


$J_F(p; q) = \frac{F(p) + F(q)}{2} - F\left( \frac{p + q}{2} \right)$
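
The Burbea-Rao divergence measures the gap in Jensen's inequality for a convex generator F: J_F(p; q) = (F(p) + F(q)) / 2 - F((p + q) / 2). The sketch below is a minimal illustration, assuming discrete probability vectors; the function names are illustrative. With F chosen as the negative Shannon entropy, it reproduces the Jensen-Shannon divergence.

```python
import math


def burbea_rao(F, p, q):
    """Jensen gap of a convex generator F at the midpoint of p and q."""
    midpoint = [(a + b) / 2.0 for a, b in zip(p, q)]
    return (F(p) + F(q)) / 2.0 - F(midpoint)


def neg_shannon_entropy(x):
    """F(x) = sum_i x_i log x_i, with the convention 0 log 0 = 0."""
    return sum(v * math.log(v) for v in x if v > 0.0)


p = [0.5, 0.5]
q = [0.9, 0.1]
print(burbea_rao(neg_shannon_entropy, p, q))  # Jensen-Shannon divergence of p and q
print(burbea_rao(neg_shannon_entropy, q, p))  # same value: the divergence is symmetric
```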


Information geometries 


Amari a-divergence (1985) 


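$f_\alpha(x) = \begin{cases}
x \log x & \alpha = 1 \\
-\log x & \alpha = -1 \\
\frac{4}{1 - \alpha^2} \left( 1 - x^{\frac{1 + \alpha}{2}} \right) & \text{otherwise}
\end{cases}$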


Quantum geometry 


Quantum entropy 
S(p) = —kTr(p log p) 
(Von Neumann 1927) 




Non-additive entropy 


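Tsallis entropy (1988) 
(Non-additive entropy) 
$T_\alpha(p) = \frac{1}{1 - \alpha} \left( \int p^\alpha - 1 \right)$


$T_\alpha(p \| q) = \frac{1}{1 - \alpha} \left( 1 - \int \frac{p^\alpha}{q^{\alpha - 1}} \right)$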
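
The "non-additive" label can be checked numerically. The sketch below is a minimal illustration, assuming the discrete form T_alpha(p) = (sum_i p_i^alpha - 1) / (1 - alpha) with alpha > 0; the function names are illustrative. It shows that the Tsallis entropy approaches the Shannon entropy as alpha tends to 1, and that for independent variables it obeys T(A, B) = T(A) + T(B) + (1 - alpha) T(A) T(B) rather than plain additivity.

```python
import math


def shannon_entropy(p):
    """H(p) = -sum_i p_i log p_i, with the convention 0 log 0 = 0."""
    return -sum(v * math.log(v) for v in p if v > 0.0)


def tsallis_entropy(p, alpha):
    """Discrete Tsallis entropy for alpha > 0; alpha = 1 falls back to Shannon."""
    if alpha == 1.0:
        return shannon_entropy(p)
    return (sum(v ** alpha for v in p if v > 0.0) - 1.0) / (1.0 - alpha)


a = [0.5, 0.5]
b = [0.2, 0.8]
joint = [x * y for x in a for y in b]  # independent joint distribution

alpha = 2.0
lhs = tsallis_entropy(joint, alpha)
rhs = (tsallis_entropy(a, alpha) + tsallis_entropy(b, alpha)
       + (1.0 - alpha) * tsallis_entropy(a, alpha) * tsallis_entropy(b, alpha))
print(lhs, rhs)  # equal up to rounding: pseudo-additivity, not additivity
print(tsallis_entropy(a, 1.000001), shannon_entropy(a))  # nearly equal as alpha -> 1
```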


Log Det divergence 


$D(P \| Q) = \langle P, Q^{-1} \rangle - \log \det (P Q^{-1}) - \dim P$


Earth mover distance 
(EMD 1998) 


Algorithmic geometry? 


Distance between two algorithms ? 


Von Neumann divergence 
D(P||Q) = Tr(P(log P — log Q) — P + Q) 


Sony CSL 


Kolmogorov complexity 